ABSTRACT

Data distribution, degree of data replication, and transaction
access patterns are key factors in determining the performance
of distributed database systems. In order to simplify the
evaluation of performance measures, database designers and
researchers tend to make simplistic assumptions about the system.
In this paper, we investigate the effect of modeling assumptions
on the evaluation of one such measure, the number of transaction
rollbacks, in a partitioned distributed database system. We
develop six probabilistic models and derive expressions for the
number of rollbacks under each of these models. Essentially, the
models differ in terms of the available system information. The
analytical results so obtained are compared to results from
simulation. From this comparison, we conclude that most of the
probabilistic models yield overly conservative estimates of the
number of rollbacks. The effect of transaction commutativity on
system throughput is also grossly underestimated when such models
are employed.

1. INTRODUCTION 

A distributed database system is a collection of cooperating
nodes, each containing a set of data items. (In this paper, the
basic unit of access in a database is referred to as a data item.)
A user transaction can enter such a system at any of these nodes.
The receiving node, sometimes referred to as the coordinating or
initiating node, undertakes the task of locating the nodes that
contain the data items required by a transaction.

A partitioning of a distributed database (DDB) occurs when
the nodes in the network split into groups of communicating
nodes due to node or communication link failures. The nodes
in each group can communicate with each other, but no node in
one group is able to communicate with nodes in other groups. We
refer to each such group as a partition. The algorithms which
allow a partitioned DDB to continue functioning generally fall
into one of two classes [Davidson et al. 1985]. Those in the first
class take a pessimistic approach and process only those
transactions in a partition which do not conflict with
transactions in other partitions, assuring mutual consistency of
data when partitions are reunited. The algorithms in the second
class allow every group of nodes in a partitioned DDB to perform
new updates. Since this may result in independent updates to items
in different partitions, conflicts among transactions are bound to
occur, and the databases of the partitions will clearly diverge.
Therefore, they require a strategy for conflict detection and
resolution. Usually, rollbacks are used as a means for preserving
consistency; conflicting transactions are rolled back when
partitions are reunited. Since coordinating the undoing of
transactions is a very difficult task, these methods are called
optimistic, since they are useful primarily in a situation where
the number of items in a particular database is large and the
probability of conflicts among transactions is small.

In general, determining if a transaction that successfully
executed in a partition is rolled back at the time the database
is merged depends on a number of factors. Data items in the
read-set and the write-set of the transaction, the distribution of
these data items among the other partitions, access patterns of
transactions in other partitions, data dependencies among the
transactions, and semantic relation (if any) between these
transactions are some examples of these factors. Exact evaluation
of rollback probability for all transactions in a database (and
hence the evaluation of the number of rolled back transactions)
generally involves both analysis and simulation, and requires
large execution times [Davidson 1982; Davidson 198?]. To overcome
the computational complexities of evaluation, designers and
researchers generally resort to approximation techniques
[Davidson 1982; Davidson 1986; Wright 1983a; Wright 1983b]. These
techniques reduce the computation time by making simplifying
assumptions to represent the underlying distributed system. The
time complexity of the resulting techniques greatly depends on
the assumed model as well as evaluation techniques.

In this paper we are interested in determining the effect of the
distributed database models on the computational complexity
and accuracy of the rollback statistics in a partitioned database.

The balance of this paper is outlined as follows. Section 2
formally defines the problem under consideration. In Section 3, we
discuss the data distribution, replication, and transaction
modeling. Section 4 derives the rollback statistics for one
distribution model. In Section 5, we compare the analysis methods
for six models and the simulation method for one model based on
computational complexity, space complexity, and accuracy of the
measure. Finally, in Section 6, we summarize the obtained results.


2. PROBLEM DESCRIPTION 

Even though a transaction $T_1$ in partition $P_1$ may be rolled
back (at merging time) by another transaction $T_2$ in partition
$P_2$ due to a number of reasons, the following two cases are found
to be the major contributors [Davidson 1982].

i. $P_1 \neq P_2$, and there is at least one data item which is
updated by both $T_1$ and $T_2$. This is referred to as a
write-write conflict.

ii. $P_1 = P_2$, $T_2$ is rolled back, and it is a dependency
parent of $T_1$ (i.e., $T_1$ has read at least one data item
updated by $T_2$, and $T_2$ occurs prior to $T_1$ in the
serialization sequence).

The above discussion on reasons for rollback only considers
the syntax of transactions (i.e., read- and write-sets) and does
not recognize any semantic relation between them. To be more
specific, let us consider transactions $T_1$ and $T_2$ executed in
two different partitions $P_1$ and $P_2$ respectively. Let us also
assume that the intersection between the write-sets of $T_1$ and
$T_2$ is non-empty. Clearly, by the above definition, there is a
write-write conflict and one of the two transactions has to be
rolled back. However, if $T_1$ and $T_2$ commute with each other,
then there is no need to roll back either of the transactions at
the time of partition merge [Garcia-Molina 1983; Jajodia and
Speckman 1985; Jajodia and Mukkamala 1990]. Instead, $T_1$ needs to
be executed in $P_2$ and $T_2$ needs to be executed in $P_1$. The
analysis in this paper takes this property into account.
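
To illustrate the commutativity property, the following minimal Python sketch merges two partitions that each executed one update on the same data item. The increment-style transactions, the function names, and the representation of a database as a dictionary are assumptions made purely for illustration; they are not taken from the analysis in this paper.

```python
# Hypothetical example: transactions are modeled as functions that map a
# database state (a dict) to a new state. None of these names come from
# the paper.

def deposit_10(db):
    """T1: add 10 to item 'x' (executed in partition P1)."""
    out = dict(db)
    out["x"] += 10
    return out

def deposit_25(db):
    """T2: add 25 to item 'x' (executed in partition P2)."""
    out = dict(db)
    out["x"] += 25
    return out

def reset_to_zero(db):
    """An overwrite-style update, used to show a non-commuting pair."""
    out = dict(db)
    out["x"] = 0
    return out

def commute(t_a, t_b, db):
    """Two transactions commute on db if both orders give the same state."""
    return t_a(t_b(db)) == t_b(t_a(db))

initial = {"x": 100}

# State of each partition just before the merge.
p1_state = deposit_10(initial)   # P1 executed T1 only
p2_state = deposit_25(initial)   # P2 executed T2 only

# Merge without rollback: run T2 in P1 and T1 in P2.
p1_merged = deposit_25(p1_state)
p2_merged = deposit_10(p2_state)

print(commute(deposit_10, deposit_25, initial))     # True
print(p1_merged == p2_merged)                       # True, both hold x = 135
print(commute(deposit_10, reset_to_zero, initial))  # False
```

Both write-sets contain the same item, so the pair is in write-write conflict syntactically, yet the two increments reach the same merged state in either order. The overwrite does not commute with an increment, so a pair of that kind would still require a rollback.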

In order to compute the number of rollbacks, it is also
necessary to define some ordering ($O(P)$) on the partitions. For
example, if $T_1$ and $T_2$ correspond to case (i) above, and do
not commute, it is necessary to determine which of these two is
rolled back at the time of merging. Partition ordering resolves
this ambiguity by the following rule: whenever two conflicting
but non-commuting transactions are executed in two different
partitions, then the transaction executed in the lower order
partition is rolled back.
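
The ordering rule for write-write conflicts can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the dictionary-based inputs, and the toy data are hypothetical, and the sketch treats a transaction as rolled back as soon as it conflicts with any non-commuting transaction in a higher order partition.

```python
# Hypothetical sketch of the partition-ordering rule for write-write
# conflicts (case (i)). All names and data below are illustrative.

def class1_rollbacks(executed, write_sets, commutes, order):
    """Return the ids of transactions rolled back due to case (i).

    executed   : dict txn id -> partition in which it ran
    write_sets : dict txn id -> set of data items it updated
    commutes   : function (txn a, txn b) -> bool
    order      : dict partition -> rank; the lower rank loses a conflict
    """
    rolled_back = set()
    for a, part_a in executed.items():
        for b, part_b in executed.items():
            if part_a == part_b:
                continue                      # case (i) needs two partitions
            if not (write_sets[a] & write_sets[b]):
                continue                      # no commonly updated item
            if commutes(a, b):
                continue                      # commuting pair: no rollback
            if order[part_a] < order[part_b]:
                rolled_back.add(a)            # lower order partition loses
    return rolled_back

executed = {"T1": "P1", "T2": "P2", "T3": "P1"}
write_sets = {"T1": {"x"}, "T2": {"x", "y"}, "T3": {"y"}}
order = {"P1": 1, "P2": 2}
commuting_pairs = {frozenset({"T2", "T3"})}   # T2 and T3 commute

def commutes(a, b):
    return frozenset({a, b}) in commuting_pairs

print(sorted(class1_rollbacks(executed, write_sets, commutes, order)))
# ['T1']
```

Here T1 and T2 update item x in different partitions and do not commute, so T1, which ran in the lower order partition, is rolled back. T3 also conflicts with T2 on item y, but the pair commutes and is left alone.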
Since a transaction may be rolled back due to either (i) or
(ii), we classify the rollbacks into two classes: Class 1 and Class
2 respectively. The problem of estimating the number of rollbacks
at the time of partition merging in a partially replicated
distributed database system may be formulated as follows.

Given the following parameters, determine the number of
rolled back transactions in class 1 ($R_1$) and class 2 ($R_2$):

• $n$, the number of nodes in the database;

• $d$, the number of data items in the database;

• $p$, the number of partitions in the distributed system (prior
to merge);

• $t$, the number of transaction types;

• $GD$, the global data directory that contains the location of
each of the $d$ data items; the $GD$ matrix has $d$ rows and $n$
columns, and each of its entries is either 0 or 1;

• $NS_k$, the set of nodes in partition $k$, $\forall k = 1, 2, \ldots, p$;

• $RS_j$, the read-set of transaction type $j$, $j = 1, 2, \ldots, t$;

• $WS_j$, the write-set of transaction type $j$, $j = 1, 2, \ldots, t$;

• $N_{jk}$, the number of transactions of type $j$ received in
partition $k$ (prior to merge), $j = 1, 2, \ldots, t$,
$k = 1, 2, \ldots, p$;

• $CM$, the commutativity matrix that defines transaction
commutativity. If $CM_{j_1 j_2}$ = true, then transaction types
$j_1$ and $j_2$ commute. Otherwise they do not commute.

The average number of total rollbacks is now expressed as
$R = R_1 + R_2$.
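
As an illustration of how these parameters fit together, the following Python sketch counts the transactions that can execute in each partition before the merge. It assumes that a transaction succeeds only if every data item in its read-set and write-set has at least one copy inside the partition, which is consistent with the treatment in Section 4.1 but is stated here as an assumption. The function names and the toy instance are hypothetical.

```python
# Hypothetical sketch: which transaction types can run in a partition,
# assuming a transaction succeeds only if every data item it reads or
# writes has at least one copy on some node of that partition.

def available_items(GD, nodes):
    """Indices of data items with a copy on at least one node in `nodes`."""
    return {i for i, row in enumerate(GD) if any(row[v] for v in nodes)}

def successful_before_merge(GD, NS, RS, WS, N):
    """Total number of transactions that execute successfully, summed
    over all partitions k and transaction types j."""
    total = 0
    for k, nodes in enumerate(NS):
        avail = available_items(GD, nodes)
        for j in range(len(RS)):
            if (RS[j] | WS[j]) <= avail:      # all accessed items available
                total += N[j][k]
    return total

# Toy instance: d = 3 items, n = 4 nodes, p = 2 partitions, t = 2 types.
GD = [
    [1, 0, 1, 0],   # item 0 is stored on nodes 0 and 2
    [0, 1, 0, 0],   # item 1 is stored on node 1 only
    [0, 0, 1, 1],   # item 2 is stored on nodes 2 and 3
]
NS = [{0, 1}, {2, 3}]            # NS[k]: nodes in partition k
RS = [{0, 1}, {0, 2}]            # RS[j]: read-set of type j
WS = [{1}, {2}]                  # WS[j]: write-set of type j
N = [[5, 7],                     # N[j][k]: arrivals of type j in partition k
     [4, 6]]

print(successful_before_merge(GD, NS, RS, WS, N))  # 11
```

In the toy instance only type 0 can run in the first partition and only type 1 in the second, so 5 + 6 = 11 of the 22 arrivals execute before the merge.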

3. MODEL DESCRIPTION 

As stated in the introduction, the primary objective of this
paper is to investigate the effect of data distribution, replication,
and transaction models on estimation of the number of rollbacks
in a distributed database system.

To describe a data distribution-transaction model, we
characterize it with three orthogonal parameters:

1. Degree of data item replication (or the number of copies).

2. Distribution of data item copies.

3. Transaction characterization.

We now discuss each of these parameters in detail.

For simplicity, several analysis techniques assume that each
data item has the same number of copies (or degree of
replication) in the database system [Coffman et al. 1981]. Some
other techniques characterize the degree of replication of a
database by the average degree of replication of data items in that
database [Davidson 1986]. Others treat the degree of replication of
each data item independently.

Some designers and analysts assume some specific allocation
schemes for data item (or group) copies (e.g., [Mukkamala 1987]).
Assuming complete knowledge of data copy distribution ($GD$)
is one such assumption. Depending on the type of allocation,
such assumptions may simplify the performance analysis. Others
assume that each data item copy is randomly distributed among
the nodes in the distributed system [Davidson 1986].

Many database analysts characterize a transaction by the size
of its read-set and its write-set. Since different transactions may
have different sizes, these are either classified based on the sizes,
or an average read-set size and average write-set size are used to
represent a transaction. Others, however, classify transactions
based on the data items that they access (and not necessarily on
their size). In this case, transaction types are identified with their
expected sizes and the group of data items from which these are
accessed. An extreme example is a case where each transaction in
the system is identified completely by its read-set and its
write-set.

With these three parameters, we can describe a number of 
models. Due to the limited space, we chose to present the results 
for six of these models in this paper. 

We chose the following six models based on their applicability
in the current literature, and their close resemblance to practical
systems. In all these models, the rate of arrival of transactions
at each of the nodes is assumed to be completely known a priori.
We also assume complete knowledge of the partitions (i.e., which
nodes are in which partitions) in all the models.

Model 1: Among the six chosen models, this has the maximum
information about data distribution, replication, and
transactions in the system. It captures the following
information.

• Replication: Data replication is specified for each data
item.

• Data distribution: The distribution of data items among
the nodes in the system is represented as a distribution
matrix (as described in Section 2).

• Transactions: All distinct transactions executed in a
system are represented by their read-sets and write-sets.
Thus, for a given transaction, the model knows which data
items are read, and which data items are updated. The
commutativity information is also completely known and is
expressed as a matrix (as described in Section 2).

Model 2: This model reduces the number of transactions
by combining them into a set of transaction types based on
commutativity, commonalities in data access patterns, etc.
Since the transactions are now grouped, some of the
individual characteristics of transactions (e.g., the exact
read-set and write-set) are lost. This model has the following
information.

• Replication: Average degree of replication is specified
at the system level.

• Data distribution: Since the read- and write-set
information is not retained for each transaction type, the
data distribution information is also summarized in
terms of average data items. It is assumed that the
data copies are allocated randomly to the nodes in the
system.

• Transactions: A transaction type is represented by
its read-set size, write-set size, and the number of
data items from which selection for read and write
is made. Since two transaction types might access the
same data item, it also stores this overlap information
for every pair of transaction types. The commutativity
information is stored for each pair of transaction
types.

Model 3: This model further reduces the transaction types
by grouping them based only on commutativity
characteristics. No consideration is given to commonalities in
data access pattern or differing read-set and write-set sizes. It
has the following information.

• Replication: Average degree of replication is specified 
at the system level. 

• Data distribution: As in model 2, it is assumed that 
the data copies are allocated randomly to the nodes 
in the system. 

• Transactions: A transaction type is represented by 
the average read-set size and average write-set size. 
The commutativity information is stored for all pairs 
of transaction types. 

Model 4: This model classifies transactions into three
types: read-only, read-write, and others. Read-only
transactions commute among themselves. Read-write
transactions neither commute among themselves nor with
others. The others class corresponds to update transactions
that may or may not commute with transactions in their
own class. This fact is represented by a commute probability
assigned to it.

• Replication: Average degree of replication is specified
at the system level.

• Data distribution: As in model 2, it is assumed that
the data copies are allocated randomly to the nodes
in the system.

• Transactions: Read-only class is represented by average
read-set size. The read-write class is represented
by average read-set and write-set sizes. The others
class is represented by the average read-set size, average
write-set size, and the probability of commutation.

Model 5: This model reduces the transactions to two
classes: read-only and read-write. Read-only transactions
commute among themselves. The read-write transactions
correspond to update transactions that may or may not
commute with transactions in their own class. This fact is
represented by a commute probability assigned to it.

• Replication: Average degree of replication is specified
at the system level.

• Data distribution: As in model 2, it is assumed that
the data copies are allocated randomly to the nodes
in the system.

• Transactions: Read-only class is represented by average
read-set size. The read-write class is represented
by average read-set and write-set sizes, and the
probability of commutation.

Model 6: This model identifies read-only transactions and
update transactions, but these two types have the
same average read-set size. Update transactions may or
may not commute with other update transactions.

• Replication: Average degree of replication is specified
at the system level.

• Data distribution: As in model 2, it is assumed that
the data copies are allocated randomly to the nodes
in the system.

• Transactions: The read-set size of a transaction is
denoted by its average. For update transactions, we also
associate an average write-set size and the probability
of commutation.

Among these models, model 1 is very general, and assumes complete
information of data distribution, replication, and transactions.
Other models assume only partial (or average) information
about data distribution and replication. Model 1 has the most
information and model 6 has the least.

4. COMPUTATION OF THE AVERAGES 

Several approaches offer potential for computing the average
number of rollbacks for a given system environment, the most
prominent being simulation and probabilistic analysis. A
simulation generally has to be repeated a number of times to get
the required confidence in the final result.

Probabilistic analysis is especially useful when interest is
confined to deriving the average behavior of a system from a given
model. Generally it requires less computation time. In this
paper, we present detailed analysis for model 6, and a summary of
the analysis for models 1-5.

4.1 Derivations for Model 6 

This model considers only two transaction types: read-only
(Type 1) and read-write (Type 2). Both have the same average
read-set size of $r$. A read-write transaction updates $w$ of the
data items that it reads. $N_{1k}$ and $N_{2k}$ represent the rate
of arrival of types 1 and 2 respectively at partition $k$. The
average degree of replication of a data item is given as $c$. The
system has $n$ nodes and $d$ data items. The probability that two
read-write transactions commute is $m$.

Consider an arbitrary transaction $T_1$ received at one
of the nodes in partition $k$ with $n_k$ nodes. Since the copies of
a data item are randomly distributed among the $n$ nodes, the
probability that a single data item is accessible in partition $k$
is given by

$$\alpha_k = 1 - \frac{\binom{n - n_k}{c}}{\binom{n}{c}} \qquad (1)$$

Since each data item is independently allocated, the expected
number of data items available in this partition is $d\alpha_k$.
Similarly, since $T_1$ accesses $r$ data items (on the average), the
probability that it will be successfully executed is $\alpha_k^r$.
From here, the number of successful transactions in $k$ is
estimated as $\alpha_k^r N_{1k}$ and $\alpha_k^r N_{2k}$ for types
1 and 2 respectively.
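
The accessibility probability of equation (1) and the resulting estimate of successful transactions can be evaluated with a few lines of Python. This is a minimal sketch, assuming that the $c$ copies of a data item are placed on $c$ distinct nodes chosen uniformly at random; the function names and the numeric values are illustrative and are not taken from the paper's experiments.

```python
from math import comb

def accessibility(n, n_k, c):
    """Probability that at least one of the c copies of a data item,
    placed on c distinct nodes chosen uniformly at random out of n,
    falls inside a partition of n_k nodes."""
    return 1 - comb(n - n_k, c) / comb(n, c)

def expected_successes(n, n_k, c, r, arrivals):
    """Expected number of transactions (out of `arrivals`) whose r data
    items are all accessible in the partition."""
    return accessibility(n, n_k, c) ** r * arrivals

# Illustrative values only: n = 10 nodes split 4/6, c = 3 copies, r = 4.
a_small = accessibility(10, 4, 3)    # 1 - C(6,3)/C(10,3) = 1 - 20/120
a_large = accessibility(10, 6, 3)    # 1 - C(4,3)/C(10,3) = 1 - 4/120
print(round(a_small, 4), round(a_large, 4))          # 0.8333 0.9667
print(round(expected_successes(10, 4, 3, 4, 1000)))  # 482
```

When the number of copies exceeds the number of nodes outside the partition, `math.comb` returns 0 for the numerator and the accessibility is exactly 1; for example, `accessibility(10, 4, 8)` evaluates to 1.0.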

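Now consider a transaction $T_2$, executed in another partition,
that updates a data item in the write-set of $T_1$ and does not
commute with $T_1$. The probability that a given data item (updated
by $T_1$) is not updated in another partition $k'$ by a
non-commuting transaction (with respect to $T_1$) is given by

[Equation (2) is illegible in the source.]

Given that a data item is available in $k$, the probability that
it is not available in $k'$ is given as

$$\delta(k, k') = \frac{\binom{n - n_{k'}}{c} - \binom{n - n_k - n_{k'}}{c}}{\alpha_k \binom{n}{c}} \qquad (3)$$

where $n_k$ and $n_{k'}$ are the numbers of nodes in partitions $k$
and $k'$, and $\alpha_k$ is the probability that a data item is
accessible in partition $k$.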
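
The conditional probability described above (a data item is available in $k$ but has no copy in $k'$) can be checked numerically. The following Python sketch assumes that the $c$ copies are placed on distinct nodes chosen uniformly at random and that the two partitions are disjoint; it computes the closed form from binomial coefficients and compares it with a brute-force enumeration of all placements. The function names and the parameter values are hypothetical.

```python
from itertools import combinations
from math import comb

def not_in_other_given_in_k(n, n_k, n_k2, c):
    """P(item has no copy in partition k' | item has a copy in k),
    for c copies on distinct nodes chosen uniformly among n nodes."""
    in_k = comb(n, c) - comb(n - n_k, c)       # placements touching k
    missing_k2 = comb(n - n_k2, c)             # placements avoiding k'
    missing_both = comb(n - n_k - n_k2, c)     # placements avoiding k and k'
    return (missing_k2 - missing_both) / in_k

def brute_force(n, n_k, n_k2, c):
    """Same probability by enumerating every placement of the c copies."""
    k_nodes = set(range(n_k))                  # nodes of partition k
    k2_nodes = set(range(n_k, n_k + n_k2))     # nodes of partition k'
    hit_k = hit_k_miss_k2 = 0
    for placement in combinations(range(n), c):
        s = set(placement)
        if s & k_nodes:
            hit_k += 1
            if not (s & k2_nodes):
                hit_k_miss_k2 += 1
    return hit_k_miss_k2 / hit_k

# Illustrative check: n = 9 nodes, partitions of 3 and 4 nodes, c = 2.
print(not_in_other_given_in_k(9, 3, 4, 2))
print(brute_force(9, 3, 4, 2))
```

For these values both functions evaluate the ratio 9/21, since 21 of the 36 placements touch partition $k$ and 9 of those avoid $k'$. The denominator `comb(n, c) - comb(n - n_k, c)` equals the accessibility probability of partition $k$ multiplied by `comb(n, c)`.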


From here, the probability that a data item available in $k$ is
not updated by any other transaction in higher order partitions is
given as

[Equation (4) is illegible in the source.]

The probability that transaction $T_1$ is not in write-write
conflict with any other non-commuting transaction of higher-order
partitions is now given as

[Equation (5) is illegible in the source.]

From here, the number of transactions rolled back due to category
(i) is estimated as $R_1$ [expression illegible in the source].
Next, we determine the probability that $T_1$ is rolled back due
to the rollback of another transaction in partition $k$. The
probability that $T_1$ depends on $T_2$ (i.e., a read-write
conflict) is given by
[Equation (6) is illegible in the source.]

The probability that $T_1$ is not rolled back due to the rollback
of any of its dependency parents is now given by:

[Equation (7) is illegible in the source.]

where $N_k = N_{1k} + N_{2k}$ (the definition of $u$ is illegible
in the source).

The total number of rolled back transactions due to category
(ii) is now estimated as $R_2$ [expression illegible in the
source]. The total number of rolled back transactions is
$R = R_1 + R_2$.


5. COMPARISON OF THE MODELS 

As mentioned in the introduction, the main objective of this
paper is to determine the effect of data distribution, replication,
and transaction models on the estimation of rollbacks. To achieve
this, we evaluate the desired measure using six different data
distribution and replication models. The comparison of these
evaluations is based on computational time, storage requirement,
and the average values obtained.

Due to the limited space, we could not present the detailed
derivations for the average values for models 2-6. The final
expressions, however, are presented in [Mukkamala 1990].


5.1 Computational Complexity

We now analyze each of the evaluation methods (for models
1-6) for their computational complexity.

• In model 1, all $t$ transactions are completely specified, and
the data distribution matrix is also known. To determine
if a transaction is successful, we need to scan the
distribution matrix. Similarly, determining if a transaction in
a lower order partition is to be rolled back due to a
write-write conflict with a transaction of higher order partition
requires comparison of write-sets of the two transactions.
Determining if a transaction needs to be rolled back due to
the rollback of a dependency parent also requires a search.
All this requires $O(ndt + p^2 t^2 + p t^2 N)$ time, where $t$ is
the number of transaction types and $N$ is the maximum number of
transactions executed in a partition prior to the merge.

• Models 2-6 have a similar computation structure. The
number of transaction types ($t$) is high for model 2 and low for
model 6. Each of these models requires $O(p^2 t^2 c + p t^2 N)$
time. As before, $t$ is the number of transaction types and
$N$ is the maximum number of transactions executed in a
partition prior to the merge.

Thus, model 1 is the most complex (computationally) and model
6 is the least complex.

5.2 Space Complexity

We now discuss the space complexity of the six evaluation
methods:

• Model 1 requires $O(dn)$ to store the data distribution
matrix, $O(n)$ to store the partition information, $O(dt)$ to store
the data access information, and $O(nt)$ to store the
transaction arrival information. It also requires $O(t^2)$ to store
the commutativity information. Thus, it requires
$O(dn + dt + nt + t^2)$ space to store model information.

• Models 4-6 require similar information: $O(t)$ to store the
average size of read- and write-sets of transaction types,
$O(nt)$ for transaction arrival, $O(n)$ for partition
information, and $O(t)$ for commute information. Thus they require
$O(nt)$ space.

• Model 3, in addition to the space required by models 4-6,
also requires $O(t^2)$ for the commutativity matrix. Thus it
requires $O(nt + t^2)$ space.

• Model 2, in addition to the space required by model 3,
also requires $t^2$ space to store the data overlap information.
Thus, it requires $O(nt + t^2)$ storage.

Thus, model 1 has the largest storage requirement and model 6
has the least.

5.3 Evaluation of the Averages

In order to compare the effect of each of these models on
the evaluation of the average rollbacks, we have run a number of
experiments. In addition to the analytical evaluations for models
1-6, we have also run simulations with Model 1. The results
from these runs are summarized in Tables 1-7. Basically these
tables describe the number of transactions successfully executed
before partition merge (Before Merge), number of rollbacks due
to class 1 ($R_1$), rollbacks due to class 2 ($R_2$), and
transactions considered to be successful at the completion of merge
(After Merge). Obviously, the last term is computed from the
earlier three terms. In all these tables, the total number of
transaction arrivals into the system during partitioning is taken
to be 65000. Also, each node is assumed to receive an equal share
of the incoming transactions.

• Table 1 summarizes the effect of number of partitions as
measured with Models 1-6. Here, it is assumed that each
of the data items in the system has exactly $c = 3$ copies.
The other assumptions in models 1-6 are as follows:

    1. Model 1 considers 130 transaction types in the system.
    Each is described by its read- and write-sets and
    whether it commutes with the other transactions. 90
    of the 130 are read-only transactions. The remaining
    40 are read-write. Among the read-write, 15 commute
    with each other, another 10 commute with each other,
    and the remaining 15 do not commute at all. The
    simulation run takes the same inputs but evaluates the
    averages by simulation.

    2. Model 2 maps the 130 transaction types into 4 classes.
    To make the comparisons simple, the above four classes
    (90+15+10+15) are taken as four types. The data
    overlap is computed from the information provided in
    model 1.

    3. Model 3, to facilitate comparison of results, considers
    the above 4 classes. This model, however, does not
    capture the data overlap information.

    4. Model 4 considers three types: read-only, read-write
    that commute among themselves with some probability,
    and read-write that do not commute at all.

    5. Model 5 considers read-only transactions with read-set
    size of 3 and read-write transactions with read-set size
    of 6. Read-write transactions commute with a given
    probability.

    6. Model 6 only considers the average read-set size
    (computed as 4 in our case), the portion of read-write
    transactions (= 45/130), and the average write-set size for
    a read-write (= 2). Probability that any two transactions
    commute is taken to be 0.4.
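
As a quick check of how the averaged parameters of Model 6 relate to the class-level description, the sketch below recomputes the average read-set size from the type counts listed for Model 1 (90 read-only and 40 read-write types) and the read-set sizes listed for Model 5 (3 and 6). Weighting every transaction type equally is an assumption made here for illustration; the source does not state how the value 4 was obtained.

```python
# Assumed inputs, taken from the descriptions of Models 1 and 5 above.
read_only_types, read_write_types = 90, 40
read_only_size, read_write_size = 3, 6

total_types = read_only_types + read_write_types
avg_read_set = (read_only_types * read_only_size
                + read_write_types * read_write_size) / total_types

print(total_types)             # 130
print(round(avg_read_set, 2))  # 3.92
print(round(avg_read_set))     # 4
```

Under this equal-weight assumption the average is 510/130, which rounds to the value of 4 used for Model 6.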

From Table 1 it may be observed that:

    • The analytical results from analysis of Model 1 are a
    close approximation of the ones from simulation.

    • The evaluation of number of successful transactions
    prior to the merge is well approximated by all the
    models. Model 6 deviated the most.

    • The difference in estimations of $R_1$ and $R_2$ is
    significant across the models. Model 1 is closest to the
    simulation. Model 6 has the worst accuracy. Model
    5, surprisingly, is somewhat better than Models [...], [...],
    and 6.

    • The estimation of $R_2$ from models 2-6 is about 50
    times the estimation from Model 1. The estimations
    from Model 1 and the simulation are quite close.

    • From here, we can see that Models 2-6 yield overly
    conservative estimates of the number of rollbacks at
    the time of partition merge. While Model 1 estimated
    the rollbacks as 1200, Models 2-6 have approximated
    them as about [value illegible in the source].

    • This difference in estimations seems to exist even when
    the number of partitions is increased.

• Table 2 summarizes the effect of number of copies on the
evaluation accuracies of the models. It may be observed
that:

    • The difference between evaluations from Model 1 and
    the others is significant at low ($c = 3$) as well as high
    ($c = 8$) values of $c$. Clearly, the difference is more
    significant at high degrees of replication.

    • The case $n_1 = 4$, $n_2 = 6$, $c = 8$ corresponds to a case
    where each of the 500 data items is available in both
    the partitions. This is also evident from the fact that
    all the 65000 input transactions are successful prior to
    the merge.

    • The results from the analysis of Model 1 are close to
    those from simulation.

• Table 3 shows the effect of increasing the number of nodes
[...]. For large values of $n$, all the six
models result in good approximations of successful
transactions prior to merge. The differences in estimations of
$R_1$ and $R_2$ still persist.

• Table 4 compares models 5 and 6. While model 6 only [...]
$R_1$ and $R_2$. In addition, the effect of commutativity on
$R_1$ and $R_2$ is not evident until $m$ exceeds [value illegible
in the source]. This is counterintuitive. The simplistic nature of
the models is the real cause of this observation. Thus, even
though these models have resulted in conservative estimates of
$R_1$ and $R_2$, we can still accept any positive conclusions
about the effect of commutativity on the system throughput.

• The comments that were made about the conservative
nature of the estimates from models 5 and 6 also apply to
model 2. These results are summarized in Table 5. Even
though this model has much more system information than
models 5 and 6, the results ($R_1$ and $R_2$) are not very
different. However, the effect of commutativity can now be seen
at $m > 0.95$.

• Having observed that the effect of commutativity is almost
lost for smaller values of $m$ in models 2-6, we will now look
at its effect with model 1. These results are summarized
in Table 6. Even at small values of $m$, the effect of
commutativity on the throughput is evident. In addition, it
increases with $m$. This observation holds at both small
and large values of $c$.

• In Table 7, we summarize the effect of variations in
number of copies. In Tables 1-6, we assumed that each data
item has exactly the same number of copies. This is more
relevant to Model 1. Thus we only consider this model in
determining the effect of copy variations on evaluation of $R_1$
and $R_2$. As shown in this table, the effect is significant. As
the variation in number of copies is increased, the number
of successful transactions prior to merge decreases. Hence
the number of conflicts is also reduced. This results
[...] $R_1$ and $R_2$. As long as the variations are
not significant, the differences are also not significant.

6. CONCLUSIONS 

In this paper we have introduced the problem of estimating
the number of rollbacks in a partitioned distributed database
system. These investigations have resulted in some very
interesting observations [...] in this paper. We now summarize
our conclusions from these investigations.

• Random data models that assume only average information
about the system result in very conservative estimates of
system throughput. One has to be very cautious in
interpreting these results.

• Adding more system information does not necessarily lead
to better approximations. In this paper the system
information is increased from model 6 to model 2. Even though
this increases the computational complexity, it does not
result in any significant improvement in the estimation of
number of rollbacks.

• Model 1 represents a specific system. Here, we define the
transactions completely. Thus it is [...]. Results
(analytical or simulation) obtained from this model
represent actual behavior of the specified system. However,
results obtained from such a model are too specific, and
[...] for other systems.

• On the other hand, when we look at models 2-6 it is
possible to conclude that commutativity is not helpful [...]
disappear in the average behavior. Model 1, on the other
hand, describes a specific system, and hence can accurately
compute the rollbacks. It is also able to predict the benefits
due to commutativity more accurately.


• The distribution of number of copies seems to affect the 
evaluations significantly. Thus, accurate modeling of this 
distribution is vital to evaluation of rollbacks. 

In addition to developing several system models and evaluation
techniques for these models, this paper has one significant
contribution to the modeling, simulation, and performance
analysis community.

If an abstract system model with average information is 
employed to evaluate the effectiveness of a new technique 
or a new concept, then we should only expect conservative 
estimates of the effects. In other words, if the results from 
the average models are positive, then accept the results. 
If these are negative, then repeat the analysis with a less 
abstracted model. Concepts/techniques that are not
appropriate for an average system may still be applicable for
some specific systems.